Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: rewrite snapshot.py as an Ansible module / add support for thin origins #58

Merged
merged 8 commits into from
Jul 15, 2024

Conversation

trgill
Copy link
Collaborator

@trgill trgill commented May 20, 2024

Enhancement: rewrite snapshot.py as an Ansible module / add support for thin origins

Reason:
#43
#57

Result:
Referenced issues fixed

Issue Tracker Tickets (Jira or BZ if any):

@trgill trgill requested review from spetrosi and richm as code owners May 20, 2024 03:46
@trgill trgill changed the title Module work rewrite snapshot.py as an Ansible module / add support for thin origins May 20, 2024
library/snapshot.py Fixed Show fixed Hide fixed
library/snapshot.py Fixed Show fixed Hide fixed
library/snapshot.py Fixed Show fixed Hide fixed
tasks/main.yml Outdated Show resolved Hide resolved

- name: Assert no changes for create snapset
assert:
that: not snapshot_cmd["changed"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check won't work. It's impossible to check if include_role resulted in a change in Ansible. There is no proper way to check idempotency other than looking at the playbook output.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richm added a boolean response from the module to indicate if any action was taken. Rich, please let me know what changes I should make.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.ansible.com/ansible/latest/dev_guide/developing_modules_general.html#creating-a-module
The module should return a dict. That dict should have a field called changed which is true if the module changed something or false otherwise.

Copy link
Contributor

@richm richm May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check won't work. It's impossible to check if include_role resulted in a change in Ansible. There is no proper way to check idempotency other than looking at the playbook output.

But in this case, when the role calls this module, it uses register: snapshot_cmd, so the test playbook can check. Otherwise, you are correct - there is no way to get at the internal state. Several of the roles register the module output or otherwise use some sort of internal variable to indicate the changed state of the role for the purpose of testing for idempotency.

@@ -1,22 +1,146 @@
from __future__ import print_function
#!/usr/bin/python
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan to add python unit tests for this module?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I'll add some unit tests.

library/snapshot.py Fixed Show fixed Hide fixed
library/snapshot.py Fixed Show fixed Hide fixed
library/snapshot.py Fixed Show fixed Hide fixed
@trgill trgill changed the title rewrite snapshot.py as an Ansible module / add support for thin origins feat: rewrite snapshot.py as an Ansible module / add support for thin origins May 22, 2024
@spetrosi
Copy link
Contributor

[citest]

tasks/main.yml Outdated Show resolved Hide resolved
tasks/main.yml Outdated Show resolved Hide resolved
tasks/main.yml Outdated Show resolved Hide resolved
tasks/main.yml Outdated Show resolved Hide resolved
tasks/main.yml Outdated Show resolved Hide resolved
library/snapshot.py Outdated Show resolved Hide resolved


EXAMPLES = r"""
# Create Snapshots of all VGs
Copy link
Contributor

@richm richm May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be an example of using the module directly, not using the role.
Note that using the module directly is not supported, but module scanning tools such as ansible-test and ansible-doc may throw an error if this documentation is not in the correct format.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. The errors are a bit difficult to understand. I was following the example in the storage role. Currently I'm seeing a lot of errors for the docs: https://github.com/linux-system-roles/snapshot/actions/runs/9502379067/job/26190204498 - it looks like the storage role is using a different "-collection-version 1.78.2" vs the snapshot role is using "--collection-version 1.79.0"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also tried following the examples in: https://docs.ansible.com/ansible/latest/dev_guide/developing_modules_general.html

But had trouble with errors with that format too.

tasks/main.yml Outdated
snapshot_lvm_mount_options: "{{ snapshot_lvm_mount_options | d(omit) }}"
snapshot_lvm_fstype: "{{ snapshot_lvm_fstype | d(omit) }}"
snapshot_lvm_mountpoint: "{{ snapshot_lvm_mountpoint | d(omit) }}"
snapshot_lvm_set: "{{ snapshot_lvm_set | to_json }}"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richm is there a better way to pass snapshot_lvm_set?

I'd like it to be None if it is not set, but the way I've done it here I'm getting "{}\0" when I read it in the python.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then if you define snapshot_lvm_set as a list of dict in the module, you can just use

        snapshot_lvm_set: "{{ snapshot_lvm_set | d(omit) }}"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, on the module side, I will remove the code that expects JSON and use dict()?

When I define it as you suggested, the module gets snapshot_lvm_set set to a string, right? But the string is no longer JSON (the values have single quotes).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, on the module side, I will remove the code that expects JSON and use dict()?

yes

When I define it as you suggested, the module gets snapshot_lvm_set set to a string, right?

No, the module should get snapshot_lvm_set as a dict.

But the string is no longer JSON (the values have single quotes).

in defaults/main.yml you should have

snapshot_lvm_set: {}

snapshot_lvm_snapset_name=dict(type="str"),
snapshot_lvm_mount_options=dict(type="str"),
snapshot_lvm_mountpoint=dict(type="str"),
snapshot_lvm_set=dict(type="str"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a list of dict - see the storage role for examples e.g. https://github.com/linux-system-roles/storage/blob/main/library/blivet.py#L2131

        snapshot_lvm_set=dict(type="list", elements="dict",
                                                  options=dict(name=dict(type="str"),
                                                                         vg=dict(type="str"),
                                                                         lv=dict(type="str"),
                                                                         percent_space_required=dict(type="int"),
....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - not a list of dict but just a dict

@richm
Copy link
Contributor

richm commented Jun 10, 2024

ping - any updates?

@@ -14,7 +14,7 @@
__snapshot_failed_params.get('snapshot_lvm_percent_space_required')
}}"
snapshot_lvm_all_vgs: "{{
__snapshot_failed_params.get('snapshot_all')
__snapshot_failed_params.get('snapshot_lvm_all_vgs') | d(false)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richm how should I default this to false when snapshot_lvm_all_vgs is not defined. Currently it defaults to "" and that causes an error when the module expects it to be a boolean.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
__snapshot_failed_params.get('snapshot_lvm_all_vgs') | d(false)
__snapshot_failed_params.get('snapshot_lvm_all_vgs', false)

verbosity: 2

- name: Parse raw output
- name: Set result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need to do this - I believe the register: snapshot_cmd in the snapshot task creates the global variable snapshot_cmd

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok - fixed.

@richm
Copy link
Contributor

richm commented Jun 18, 2024

@trgill This PR should fix all of those ansible-test (and flake8, pylint, black) issues. trgill#6 I didn't test it but hopefully the errors will be obvious

A big change was using module.run_command - which means most of the methods required the addition of the module argument - the way other modules handle this is by declaring a module class e.g. SnapshotModule(AnsibleModule) and putting all of the methods in this class - then you can just use self.module.run_command everywhere instead of having to pass module as a method argument.

@richm
Copy link
Contributor

richm commented Jun 18, 2024

[citest]

@trgill
Copy link
Collaborator Author

trgill commented Jun 19, 2024

[citest bad]

@trgill
Copy link
Collaborator Author

trgill commented Jul 2, 2024

[citest bad]

@trgill
Copy link
Collaborator Author

trgill commented Jul 2, 2024

@richm I rebased on your changes, will that include the fixes needed?

@richm
Copy link
Contributor

richm commented Jul 2, 2024

@richm I rebased on your changes, will that include the fixes needed?

You'll also need to merge this PR - trgill#7

@richm
Copy link
Contributor

richm commented Jul 2, 2024

[citest] - note that centos7 tests will fail due to the move to vault.centos.org

thin_pool_size: '3g'
size: "1g"
thin: true
- name: lv2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it enough to test with 1 thin pool? There seems to be a problem using multiple thin pools with the storage subsystem in el7 - perhaps something to do with blivet on el7? I get the following error:

Failed to commit changes to disk: Process reported exit code 3:   --virtualsize may not be zero.
  Run `lvcreate --help' for more information.

@vojtechtrefny @japokorn any ideas? The storage role thin pool test uses only 1 volume - https://github.com/linux-system-roles/storage/blob/main/tests/tests_create_thinp_then_remove.yml#L37-L43 - which is why it passes on el7

If you really want to test with multiple volumes, then we'll need to define one list of volumes for el7, and another list for el8 and later.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I'll update the test to only use 1 pool.

I'll test multiple pools manually.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vojtechtrefny @japokorn it looks like if I add:

thin_pool_size: '3g'

whenever I allocate a volume that uses the thin pool it works as expected on RHEL 7 and Centos 7.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it enough to test with 1 thin pool? There seems to be a problem using multiple thin pools with the storage subsystem in el7 - perhaps something to do with blivet on el7?

We have quite an old version of blivet on EL7 so there could be a lot of bugs that were fixed between 7 and 8, but I don't remember anything specific about thin provisioning that could be causing this.

The storage role thin pool test uses only 1 volume - https://github.com/linux-system-roles/storage/blob/main/tests/tests_create_thinp_then_remove.yml#L37-L43 - which is why it passes on el7

Looks like something we should add to our test suite.

it looks like if I add: thin_pool_size: '3g' whenever I allocate a volume that uses the thin pool it works as expected on RHEL 7 and Centos 7.

We have some logic in the storage role to calculate size of logical volumes and thin pools when not specified here so it's possible this is somehow broken with older blivet. Can you please check @japokorn?

I'll try to test this scenario with the storage role and I'll report an upstream issue for us to track this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the problem was actually invalid size for the second LV:

          - name: lv2
            thin_pool_name: thin_pool
            size: "15"

size: "15" is interpreted as 15 B and on CentOS 7 this is rounded down to 0 KiB so that's why lvcreate is complaining about --virtualsize may not be zero. With newer blivet on Fedora and CentOS 9/10 we automatically adjust this size to the minimal size which explains why it was failing only on CentOS 7.

@trgill
Copy link
Collaborator Author

trgill commented Jul 8, 2024

[citest]

@trgill
Copy link
Collaborator Author

trgill commented Jul 9, 2024

[citest bad]

@richm
Copy link
Contributor

richm commented Jul 9, 2024

@trgill centos-7 testing is broken currently - it is being worked on - the only way to test that is locally with tox-lsr

@trgill
Copy link
Collaborator Author

trgill commented Jul 9, 2024

@richm Ok - I'll run centos-7 tests locally

@richm
Copy link
Contributor

richm commented Jul 9, 2024

In the latest commit, I still get errors with the thin pool tests on centos-7:

tests/tests_set_thin_basic.yml - Failed to commit changes to disk: (FSError('format failed: 1',), '/dev/mapper/test_vg2-lv7')

and tests_thin_basic.yml doesn't have thin_pool_size for all of the test volumes

trgill and others added 5 commits July 10, 2024 09:46
Update to ignore thinpool LVs and support think provisioned sources.

Signed-off-by: Todd Gill <[email protected]>
it returns rc, is_snapshot and should not be called as a boolean
function.

Signed-off-by: Todd Gill <[email protected]>
@trgill
Copy link
Collaborator Author

trgill commented Jul 10, 2024

I updated the thin tests to only use 1 source VG. I tested locally on centos-7 and it works.

@trgill
Copy link
Collaborator Author

trgill commented Jul 11, 2024

@richm I think this is ready to merge?

@richm
Copy link
Contributor

richm commented Jul 13, 2024

[citest]

@trgill
Copy link
Collaborator Author

trgill commented Jul 15, 2024

[citest bad]

@richm
Copy link
Contributor

richm commented Jul 15, 2024

not sure what the problem is with centos-9, but tests pass locally. centos-7 passes locally also.

@richm richm merged commit eff0840 into linux-system-roles:main Jul 15, 2024
22 of 26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants